This is a draft PR of some work I did to measure compile times. I've had it sitting around for a while, so I decided to open it as a draft in case someone could use it.
The general idea is that Persistent suffers from slow compile times because its Template Haskell generates very large amounts of code. This causes issues for users in both development and production, especially because Persistent models are likely a fairly "root" dependency in many codebases.
There are a number of changes Persistent could potentially make to reduce compile times, for example deriving fewer instances for keys.
This PR takes the approach of having a sample project that primarily consists of a large .persistentmodels file, which is one of the files Mercury uses in production (with modifications). We benchmark compiling it in two ways:
- We use the bench CLI program, which is a wrapper around criterion. This gives us the usual benefits of criterion, like statistical measurement of our benchmarks. The downside is that it measures the full compilation time, not just the desired module.
- We compile the project with -ddump-timings and -ddump-to-file when benchmarking, then on each build copy the file containing the timings for our models module to another directory. At the conclusion of the benchmarking, we use those timing files to compute the average time it took to compile our models.
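For reference, here is a minimal Haskell sketch of the averaging step (the script this PR actually uses is add-timings.rb; this is not a transcription of it). It assumes each -ddump-timings line carries a trailing time=&lt;milliseconds&gt; field, which is how recent GHCs print phase timings, and that the results directory contains only the copied dump files:

```haskell
import Data.List (stripPrefix)
import Data.Maybe (listToMaybe, mapMaybe)
import System.Directory (listDirectory)
import System.Environment (getArgs)
import System.FilePath ((</>))
import Text.Read (readMaybe)

-- Extract the "time=<ms>" field from one -ddump-timings line, if present.
timeMs :: String -> Maybe Double
timeMs line = listToMaybe
  [ t
  | w <- words line
  , Just s <- [stripPrefix "time=" w]
  , Just t <- [readMaybe s]
  ]

main :: IO ()
main = do
  [dir] <- getArgs
  files <- map (dir </>) <$> listDirectory dir
  -- One dump file per build; its total is the sum of all phase timings.
  totals <- mapM (fmap (sum . mapMaybe timeMs . lines) . readFile) files
  putStrLn ("Mean is " ++ show (sum totals / fromIntegral (length totals)) ++ "ms")
```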
Overall I think this is a good approach to benchmarking compilation time. It can be used with a variety of compiler settings (e.g. -O0 matters for development, but -O1 or -O2 for production). But it could use more sample projects that exercise different parts of Persistent (e.g. perhaps there is a performance degradation for models with 20+ fields; the current PR would not catch that).
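For concreteness, the models module in such a sample project is essentially one big splice, so all of the measured time lands in a single -ddump-timings dump. A minimal sketch of that module, in the usual scaffolding style (the module name and file path here are illustrative, not the ones in this PR):

```haskell
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE UndecidableInstances #-}
module Models where

import Database.Persist.Quasi (lowerCaseSettings)
import Database.Persist.TH

-- The entire compile-time cost under measurement is this one splice:
-- mkPersist expands every entity in the models file into datatypes,
-- keys, and a large pile of instances.
share
  [mkPersist sqlSettings, mkMigrate "migrateAll"]
  $(persistFileWith lowerCaseSettings "config/models.persistentmodels")
```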
As a small example, removing just the Servant typeclasses from being derived for keys shaves several hundred milliseconds off compilation. This is fairly significant given that e.g. a Yesod project has no need for these instances, and the savings would be larger if applied to more models files.
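For context, the "Servant typeclasses" are the http-api-data classes. Hand-written for a hypothetical User entity, the per-key instances being dropped look roughly like this (a sketch of what the TH emits, not the generated code verbatim, assuming a SQL backend where keys round-trip through Int64):

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE TypeFamilies #-}

import Database.Persist.Sql (Key, fromSqlKey, toSqlKey)
import Web.HttpApiData (FromHttpApiData (..), ToHttpApiData (..))

-- "User" stands in for an entity defined via mkPersist elsewhere.
-- These instances only matter when keys appear in URLs, e.g. Servant
-- route captures; Yesod routes use PathPiece instead.
instance ToHttpApiData (Key User) where
  toUrlPiece = toUrlPiece . fromSqlKey

instance FromHttpApiData (Key User) where
  parseUrlPiece = fmap toSqlKey . parseUrlPiece
```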
```
~/D/C/H/y/p/p/compile-time-testing> ruby add-timings.rb projects/Mercury/results-no-servant2 20:04:26
/System/Library/Frameworks/Ruby.framework/Versions/2.6/usr/lib/ruby/2.6.0/universal-darwin19/rbconfig.rb:229: warning: Insecure world writable dir /usr/local/sbin in PATH, mode 040777
add-timings.rb:16: warning: assigned but unused variable - start
Looking for data in projects/Mercury/results-no-servant2
Mean is 9187.288562499998ms
~/D/C/H/y/p/p/compile-time-testing> ruby add-timings.rb projects/Mercury/results-no-servant 20:04:34
/System/Library/Frameworks/Ruby.framework/Versions/2.6/usr/lib/ruby/2.6.0/universal-darwin19/rbconfig.rb:229: warning: Insecure world writable dir /usr/local/sbin in PATH, mode 040777
add-timings.rb:16: warning: assigned but unused variable - start
Looking for data in projects/Mercury/results-no-servant
Mean is 9106.216874999998ms
~/D/C/H/y/p/p/compile-time-testing> ruby add-timings.rb projects/Mercury/results-baseline/ 20:04:46
/System/Library/Frameworks/Ruby.framework/Versions/2.6/usr/lib/ruby/2.6.0/universal-darwin19/rbconfig.rb:229: warning: Insecure world writable dir /usr/local/sbin in PATH, mode 040777
add-timings.rb:16: warning: assigned but unused variable - start
Looking for data in projects/Mercury/results-baseline/
Mean is 9485.424687499997ms
```
Changing the definition of persistFieldDef on every model to error "todo" decreased build time to 8669.350124999999ms on average (combined with the no-Servant-instances speedup). A good candidate for a real speedup might be to call into entityDef, then entityFields, and look the field up there, instead of embedding each field's definition. That might incur higher costs at runtime to look up fields, though.
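A sketch of that lookup-based alternative (entityDef, entityFields, and fieldHaskell are real persistent accessors; FieldNameHS is the field-name type in recent persistent releases, older ones call it HaskellName; UserId/UserName/UserAge belong to a hypothetical User model):

```haskell
-- Instead of splicing a full FieldDef literal for every constructor,
-- the TH could emit a tiny constructor-to-name table and share one
-- lookup over the already-generated entityDef.
persistFieldDef :: EntityField User typ -> FieldDef
persistFieldDef field =
  case filter ((== name) . fieldHaskell) (entityFields (entityDef proxy)) of
    (fd : _) -> fd
    []       -> error "persistFieldDef: field missing from entityDef (unreachable)"
  where
    proxy = Proxy :: Proxy User
    name = case field of
      UserId   -> FieldNameHS "Id"
      UserName -> FieldNameHS "name"
      UserAge  -> FieldNameHS "age"
```

The trade-off noted above is real: each call now scans the entity's field list at runtime instead of returning a preallocated FieldDef.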